Search results for "Distance measures"
showing 10 items of 15 documents
Comparison of Bayesian and numerical optimization-based diet estimation on herbivorous zooplankton
2020
Consumer diet estimation with biotracer-based mixing models provides valuable information about trophic interactions and the dynamics of complex ecosystems. Here, we assessed the performance of four Bayesian and three numerical optimization-based diet estimation methods for estimating the diet composition of herbivorous zooplankton using consumer fatty acid (FA) profiles and a resource library consisting of the results of homogeneous diet feeding experiments. Method performance was evaluated in terms of absolute errors, central probability interval checks, success in identifying the primary resource in the diet, and the ability to detect the absence of resources in the diet. Despite …
The Attentional Demand of Automobile Driving Revisited: Occlusion Distance as a Function of Task-Relevant Event Density in Realistic Driving Scenari…
2014
Objective: We studied the utility of occlusion distance as a function of task-relevant event density in realistic traffic scenarios with self-controlled speed. Background: The visual occlusion technique is an established method for assessing visual demands of driving. However, occlusion time is not a highly informative measure of environmental task-relevant event density in self-paced driving scenarios because it partials out the effects of changes in driving speed. Method: Self-determined occlusion times and distances of 97 drivers with varying backgrounds were analyzed in driving scenarios simulating real Finnish suburban and highway traffic environments with self-determined vehicle speed…
Distance Functions, Clustering Algorithms and Microarray Data Analysis
2010
Distance functions are a fundamental ingredient of classification and clustering procedures, and this holds true also in the particular case of microarray data. In the general data mining and classification literature, functions such as Euclidean distance or Pearson correlation have gained their status of de facto standards thanks to a considerable amount of experimental validation. For microarray data, the issue of which distance function works best has been investigated, but no final conclusion has been reached. The aim of this extended abstract is to shed further light on that issue. Indeed, we present an experimental study, involving several distances, assessing (a) their intrinsic sepa…
Statistical classification and proportion estimation - an application to a macroinvertebrate image database
2010
We apply and compare a random Bayes forest classifier and three traditional classification methods to a dataset of complex benthic macroinvertebrate images of known taxonomical identity. Since in biomonitoring changes in benthic macroinvertebrate taxa proportions correspond to changes in water quality, their correct estimation is pivotal. As classification errors are passed on to the allocated proportions, we explore a correction method known as a confusion matrix correction. Classification methods were compared using the misclassification error and the χ2 distance measures of the true proportions to the allocated and to the corrected proportions. Using low misclassification error and small…
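As a rough illustration of the confusion matrix correction mentioned in this abstract, the sketch below recovers true class proportions from allocated ones. It assumes a known, invertible confusion matrix whose entry `confusion[i][j]` is the probability that an item of true class `i` is classified as `j`. The function names `solve_linear` and `correct_proportions` are hypothetical, and the paper's actual procedure may differ.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] with float entries.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("confusion matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def correct_proportions(confusion, allocated):
    """Estimate true class proportions from allocated (classified) ones.

    Assumption: allocated[j] = sum_i true[i] * confusion[i][j],
    so the true proportions solve confusion^T * true = allocated.
    """
    n = len(allocated)
    transposed = [[confusion[i][j] for i in range(n)] for j in range(n)]
    return solve_linear(transposed, allocated)


# Illustrative two-class example (made-up numbers, not from the paper).
confusion = [[0.9, 0.1],
             [0.2, 0.8]]
allocated = [0.41, 0.59]
corrected = correct_proportions(confusion, allocated)
```

With noisy estimates of the confusion matrix, the solved proportions can fall outside [0, 1], so a practical version would likely need some form of constraint or clipping.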
Classification Similarity Learning Using Feature-Based and Distance-Based Representations: A Comparative Study
2015
Automatically measuring the similarity between a pair of objects is a common and important task in the machine learning and pattern recognition fields. Being an object of study for decades, it has lately received an increasing interest from the scientific community. Usually, the proposed solutions have used either a feature-based or a distance-based representation to perform learning and classification tasks. This article presents the results of a comparative experimental study between these two approaches for computing similarity scores using a classification-based method. In particular, we use the Support Vector Machine as a flexible combiner both for a high dimensional feature space and …
Fuzzy Data Fusion for Real-World Mapping Using 360° Rotating Ultrasonic Sensor
1997
Mobile robot perception of the external environment is limited by the features of the sensor used. A useful technique for improving robot perception is data fusion. This paper presents an approach to building a map of an unknown environment by applying fuzzy data fusion methods to data acquired through an ultrasonic sensor. Conditioning of these data and motion control of the mobile robot by fuzzy data fusion are also described. The resulting two-dimensional map is used for path planning and navigation. The proposed approach is experimentally tested using real distance measures acquired by a 360° rotating sensor.
CUDA-Accelerated Alignment of Subsequences in Streamed Time Series Data
2014
Euclidean Distance (ED) and Dynamic Time Warping (DTW) are cornerstones in the field of time series data mining. Many high-level algorithms like kNN-classification, clustering or anomaly detection make excessive use of these distance measures as subroutines. Furthermore, the vast growth of recorded data produced by automated monitoring systems or integrated sensors establishes the need for efficient implementations. In this paper, we introduce linear memory parallelization schemes for the alignment of a given query Q in a stream of time series data S for both ED and DTW using CUDA-enabled accelerators. The ED parallelization features a log-linear calculation scheme in contrast to the naive …
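For readers unfamiliar with the two distance measures named in this abstract, a minimal CPU-only sketch follows. It is the textbook sequential formulation, not the CUDA parallelization scheme the paper introduces. The function names, the squared pointwise cost, and the final square root are assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Unconstrained DTW distance with squared pointwise cost.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the
    dynamic programming table, so memory is linear in len(b).
    """
    inf = float("inf")
    # prev[j] is the best cumulative cost aligning the processed prefix
    # of a with the first j elements of b.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[len(b)])


# A time-stretched copy is at DTW distance zero from the original,
# while ED is not even defined for sequences of different lengths.
stretched = dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3])
```

The example shows why DTW is often preferred for time series that differ in local speed: it tolerates repeated samples that ED cannot align.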
GEM
2014
The widespread use of digital sensor systems causes a tremendous demand for high-quality time series analysis tools. In this domain the majority of data mining algorithms relies on established distance measures like Dynamic Time Warping (DTW) or Euclidean distance (ED). However, the notion of similarity induced by ED and DTW may lead to unsatisfactory clusterings. In order to address this shortcoming we introduce the Gliding Elastic Match (GEM) algorithm. It determines an optimal local similarity measure of a query time series Q and a subject time series S. The measure is invariant under both local deformation on the measurement-axis and scaling in the time domain. GEM is compared to ED and…
Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships and Data-Driven Selection of Source Datasets
2012
Quantitative structure–activity relationships are regression models relating chemical structure to biological activity. Such models allow predictions to be made for toxicologically relevant endpoints, which constitute the target outcomes of experiments. The task is often tackled by instance-based methods, which are all based on the notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two families of chemical distance measures, fingerprint-based and maximum common subgraph-based measures, provide orthogonal information about chemical similarity. This paper presents a novel method for finding suitable combinations of them, called adapted tran…
Weighted Least-Squares Likelihood Ratio Test for Branch Testing in Phylogenies Reconstructed from Distance Measures
2005
A variety of analytical methods is available for branch testing in distance-based phylogenies. However, these methods are rarely used, possibly because the estimation of some of their statistics, especially the covariances, is not always feasible. We show that these difficulties can be overcome if some simplifying assumptions are made, namely distance independence. The weighted least-squares likelihood ratio test (WLS-LRT) we propose is easy to perform, using only the distances and some of their associated variances. If no variances are known, the use of the Felsenstein F-test, also based on weighted least squares, is discussed. Using simulated data and a data set of 43 mammalian mitochondr…